Knowledge Base Augmentation using Tabular Data

Authors

  • Yoones A. Sekhavat
  • Francesco Di Paolo
  • Denilson Barbosa
  • Paolo Merialdo
Abstract

Large linked data repositories have been built by leveraging semi-structured data in Wikipedia (e.g., DBpedia) and through extracting information from natural language text (e.g., YAGO). However, the Web contains many other vast sources of linked data, such as structured HTML tables and spreadsheets. Often, the semantics in such tables is hidden, preventing one from extracting triples from them directly. This paper describes a probabilistic method that augments an existing knowledge base with facts from tabular data by leveraging a Web text corpus and natural language patterns associated with relations in the knowledge base. A preliminary evaluation shows high potential for this technique in augmenting linked data repositories.
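
The general idea in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the pattern templates, the additive smoothing, the threshold, and the function names relation_scores and propose_triples are all assumptions made for illustration. The sketch counts how often hypothetical natural language patterns for each relation match the entity pairs from two table columns in a small text corpus, normalizes the counts into a distribution over relations, and proposes triples for every row when one relation is likely enough.

```python
from collections import Counter

# Hypothetical textual patterns per knowledge-base relation (assumed for
# illustration); {s} and {o} stand for the subject and object entities.
PATTERNS = {
    "capitalOf": ["{s} is the capital of {o}"],
    "locatedIn": ["{s} is located in {o}", "{s} is a city in {o}"],
}


def relation_scores(pairs, corpus, patterns=PATTERNS, smoothing=1.0):
    """Estimate a distribution over relations for a pair of table columns.

    pairs  : list of (subject, object) strings, one per table row
    corpus : list of sentences standing in for a Web text corpus
    """
    counts = Counter()
    for s, o in pairs:
        for rel, templates in patterns.items():
            for template in templates:
                phrase = template.format(s=s, o=o).lower()
                # Count sentences that contain the instantiated pattern.
                counts[rel] += sum(phrase in sent.lower() for sent in corpus)
    # Additive smoothing keeps unseen relations at a small non-zero probability.
    total = sum(counts[rel] + smoothing for rel in patterns)
    return {rel: (counts[rel] + smoothing) / total for rel in patterns}


def propose_triples(pairs, corpus, threshold=0.5):
    """Propose (subject, relation, object) triples for all rows if the
    best-scoring relation reaches the (assumed) probability threshold."""
    scores = relation_scores(pairs, corpus)
    best = max(scores, key=scores.get)
    if scores[best] < threshold:
        return []
    return [(s, best, o) for s, o in pairs]
```

In this sketch, a row whose entity pair never appears in the corpus can still receive a triple, because the relation is inferred for the column pair as a whole from the rows that do match.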

Similar resources

Semantic Search in Tabular Structures

Semantic Web search aims to overcome the bottleneck of finding relevant information by using formal knowledge models, e.g., ontologies. The focus of this paper is to extend a typical search engine with semantic search over tabular structures. We categorize HTML documents into topics and genres. Using the TARTAR system, tabular structures in the documents are then automatically transformed into ...

Abstractive Tabular Dataset Summarization via Knowledge Base Semantic Embeddings

This paper describes an abstractive summarization method for tabular data which employs a knowledge base semantic embedding to generate the summary. Assuming the dataset contains descriptive text in headers, columns and/or some augmenting metadata, the system employs the embedding to recommend a subject/type for each text segment. Recommendations are aggregated into a small collection of super t...

Improving Open Data Usability through Semantics

With the success of Open Data, a huge amount of tabular data has become available that could potentially be mapped and linked into the Web of (Linked) Data. The use of semantic web technologies would then allow users to explore related content and would enable enhanced search functionalities across data portals. However, existing linkage and labeling approaches mainly rely on mappings of textual information to classes...

Development of ICD-10-TM ontology for a semi-automated morbidity coding system in Thailand.

OBJECTIVES: The International Classification of Diseases and Related Health Problems, 10th Revision, Thai Modification (ICD-10-TM) ontology is a knowledge base created from the Thai modification of the World Health Organization International Classification of Diseases and Related Health Problems, 10th Revision. The objectives of this research were to develop the ICD-10-TM ontology as a knowledge...

Visual Design and On-line Verification of Tabular Rule-Based

The paper presents a new approach to the joint design and verification of rule-based systems. The principal idea is that verification should be performed on-line, incrementally, during system design. This allows for early detection and handling of knowledge base anomalies and inconsistencies. The proposed approach also offers an innovative visual tool for computer-aided desig...

Publication date: 2014